View Post

Poster: jrwebmaster Date: Nov 12, 2016 8:34pm
Forum: audio Subject: Problem with file link - local path is being used in archive.org link, causing problems

I bulk uploaded a bunch of files using the ia command-line tool and a .csv file on my local machine. The "file" path for each item in the .csv file was a full path, including a Windows drive letter (H:\etc).

It never occurred to me that the local Windows path might somehow be used by archive.org in storing the item (as opposed to being used just to find the file and upload it). Apparently I was wrong, and this has created two problems.

Here's a sample item: https://archive.org/details/JR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith

The item looks fine until you try to download the VBR MP3 (i.e., the original file) from the link in the sidebar. If you do, you get a "page not found" error. The URL the sidebar links to is this: https://ia601507.us.archive.org/22/items/JR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith/H:/web_dev/judith-ragir-audio-revised/JR12/JR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith.mp3

I'm guessing that the colon (H:) is a problem, and perhaps also the slashes.

If you click "show all" in the sidebar, you get a new page, and the link for the VBR MP3 on that page does work: https://archive.org/download/JR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith/H%3a%5cweb_dev%5cjudith-ragir-audio-revised%5cJR12%5cJR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith.mp3

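As you can see, this link percent-encodes the special characters (%3a for the colon, %5c for each backslash) instead of using them literally, and it keeps the backslashes where the broken link has forward slashes.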
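To illustrate the encoding, here is a minimal Python sketch using only the standard library. The variable names are illustrative, and this is not how archive.org builds its links; it only shows that decoding the last path segment of the working link yields the original Windows path.

```python
from urllib.parse import quote, unquote

# Last path segment of the working "show all" link quoted above.
segment = (
    "H%3a%5cweb_dev%5cjudith-ragir-audio-revised%5cJR12%5c"
    "JR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith.mp3"
)

# Percent-decoding turns %3a back into ":" and %5c back into "\".
decoded = unquote(segment)
print(decoded)

# Re-encoding with no "safe" characters escapes ":" and "\" again.
# Python emits uppercase hex digits (%3A, %5C), which are equivalent
# to the lowercase forms used in the link.
reencoded = quote(decoded, safe="")
print(reencoded)
```

The decoded value is the whole local path, which suggests the item stores that full string as the filename.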

The fact that archive.org used the full path when I uploaded these files seems like a bad design choice.

The only way I can think of to fix this is to delete all of these files and reupload them without using the path for the file (i.e., by executing ia in the directory with the files, so I don't need the path).

Is there any other option? And is there any way you could fix this design flaw? I don't know if the flaw is in the ia command-line tool or in the archive.org backend. But in any case, it does not make sense to use a full pathname to generate an archive.org internal address for a saved file. The archive.org system should be smart enough to just use the filename and to disregard all of the path information.

Thanks.

Reply

Poster: jakej Date: Nov 15, 2016 10:42am
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

Hi jrwebmaster,

This is a bug in the 'ia' command-line tool. ia expects forward slashes ("/") as the path separator when it derives the remote filename from the local path, so a Windows path that uses backslashes is not reduced to its bare filename.
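The failure mode can be sketched in a few lines of Python. This is not the actual code in ia, nor the eventual fix; `naive_remote_name` and `portable_remote_name` are hypothetical helpers that only illustrate why splitting on "/" alone leaves a Windows path intact.

```python
import ntpath

local_path = (
    r"H:\web_dev\judith-ragir-audio-revised\JR12"
    r"\JR12-2004-03-11-SER1-dismantling-patterns-part-2-divine-abodes-judith.mp3"
)


def naive_remote_name(path):
    """Hypothetical: keep whatever follows the last forward slash."""
    return path.split("/")[-1]


def portable_remote_name(path):
    """Hypothetical: ntpath.basename treats both "\\" and "/" as separators."""
    return ntpath.basename(path)


# No "/" in the path, so the whole string survives as the "filename".
naive = naive_remote_name(local_path)
# Drive letter and directories are dropped; only the .mp3 name remains.
portable = portable_remote_name(local_path)

print(naive)
print(portable)
```

Because `ntpath.basename` also accepts forward slashes, the same helper would work for paths written on Linux or macOS.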

Thanks for the report. I've created an issue for this bug on GitHub: https://github.com/jjjake/internetarchive/issues/163

I will try to get this sorted soon, but I think you're correct that you could re-upload the files without using the path (i.e. by executing ia in the dir with the files...) to get around this issue for now.
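One way to prepare for that workaround is to rewrite the spreadsheet so the "file" column holds bare filenames, then run ia from the directory that contains the files. The sketch below is only an illustration: `strip_paths` is a hypothetical helper, the "identifier" column and the sample row are assumptions, and only the "file" column name comes from the original post.

```python
import csv
import io
import ntpath


def strip_paths(csv_text, column="file"):
    """Return csv_text with `column` reduced to the bare filename."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # ntpath.basename drops the drive letter and any directories.
        row[column] = ntpath.basename(row[column])
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row; a real spreadsheet would be read from disk.
sample = "identifier,file\nexample-item,H:\\audio\\example-item.mp3\n"
cleaned = strip_paths(sample)
print(cleaned)
```

Any other columns in the spreadsheet pass through unchanged, since only the named column is rewritten.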

Thanks again for the report, I appreciate it! : )

Reply

Poster: bigbearii Date: May 1, 2017 10:16am
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

Hi Jake,

Curious how to get a hold of you. I am interested in including one of the songs you archived on a soundtrack that I'm producing for a podcast called Crimetown.

Do you have an email address I can reach you at?

Thanks
Matthew

Reply

Poster: jakej Date: Nov 17, 2016 12:11pm
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

Hi again jrwebmaster,

I just merged a pull request that should address your issue: https://github.com/jjjake/internetarchive/commit/2f927ca781bb987a7897ec915702917710d21bc2

It hasn't been shipped yet, but you can try it out by cloning https://github.com/jjjake/internetarchive and using the master branch.

I'll try to get a new version released soon.

Reply

Poster: jrwebmaster Date: Nov 17, 2016 12:32pm
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

Thanks! That was fast.

Reply

Poster: Jeff Kaplan Date: Nov 15, 2016 10:27am
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

If you email the spreadsheet to info@archive.org, we might be able to help.

Reply

Poster: jrwebmaster Date: Dec 13, 2016 10:34am
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

I emailed the spreadsheet almost a month ago. I then exchanged a couple of emails with Jake Johnson to try to sort this out.

It has been 22 days now since Jake's last email to me. Since then I have sent two follow-up emails and received no response. I sent my most recent follow-up email 3 days ago to Jake and to two other addresses at archive.org.

Please, could someone help me get this sorted out? It really should not be that time-consuming. It's just a matter of bulk deleting some files, and I have provided a spreadsheet to be used in scripting the bulk deletion.

Reply

Poster: jrwebmaster Date: Dec 16, 2016 3:59pm
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

Jake took care of deleting the badly named files for the items in one of the two collections. Now I just need the same bulk deletion done for items in the second collection.

I emailed Jake and info@archive.org with the request, along with a CSV file listing the 203 items that need files deleted. I am posting here as well to ensure that my request makes it into the queue. Once that request is handled, this whole mess will be resolved and I won't need any more support on this. Thanks so much!

Reply

Poster: jrwebmaster Date: Nov 15, 2016 10:38am
Forum: audio Subject: Re: Problem with file link - local path is being used in archive.org link, causing problems

Okay, I will do that. Thanks.